
Collaborating Authors

NVIDIA Research


EvoGraph: Hybrid Directed Graph Evolution toward Software 3.0

Costa, Igor, Baran, Christopher

arXiv.org Artificial Intelligence

We introduce **EvoGraph**, a framework that enables software systems to evolve their own source code, build pipelines, documentation, and tickets. EvoGraph represents every artefact in a typed directed graph, applies learned mutation operators driven by specialized small language models (SLMs), and selects survivors with a multi-objective fitness function. On three benchmarks, EvoGraph fixes 83% of known security vulnerabilities, translates COBOL to Java with 93% functional equivalence (test verified), and maintains documentation freshness within two minutes. Experiments show a 40% latency reduction and a sevenfold drop in feature lead time compared with strong baselines. We extend our approach to **evoGraph**, leveraging language-specific SLMs for modernizing .NET, Lisp, CGI, ColdFusion, legacy Python, and C codebases, achieving 82-96% semantic equivalence across languages while reducing computational costs by 90% compared to large language models. EvoGraph's design responds to empirical failure modes in legacy modernization, such as implicit contracts, performance preservation, and integration evolution. Our results suggest a practical path toward Software 3.0, where systems adapt continuously yet remain under measurable control.
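The evolve-mutate-select loop the abstract describes can be sketched generically. All names below (`Artefact`, `mutate`, `fitness`, `evolve`) and the toy scoring scheme are illustrative assumptions, not EvoGraph's actual API.

```python
from dataclasses import dataclass, field

# Illustrative sketch of the loop in the abstract: artefacts live in a typed
# directed graph, mutation operators propose edits, and a multi-objective
# fitness selects survivors. All names and scores here are assumptions.

@dataclass
class Artefact:
    kind: str                                   # node type: "source", "doc", "ticket", ...
    content: str
    edges: list = field(default_factory=list)   # typed directed edges to other artefacts

def mutate(a: Artefact) -> Artefact:
    # Stand-in for an SLM-driven mutation operator: here, a trivial text edit.
    return Artefact(a.kind, a.content + " [patched]", list(a.edges))

def fitness(a: Artefact) -> tuple:
    # Multi-objective fitness, e.g. (coverage, -latency). Toy proxies here:
    # longer content scores higher, unresolved TODOs score lower.
    return (len(a.content), -a.content.count("TODO"))

def evolve(population, generations=3, survivors=2):
    for _ in range(generations):
        offspring = [mutate(a) for a in population]
        # Lexicographic multi-objective selection over parents plus offspring.
        population = sorted(population + offspring, key=fitness, reverse=True)[:survivors]
    return population

pop = [Artefact("source", "def handler(): pass  # TODO")]
best = evolve(pop)[0]
print(best.kind)  # → source
```

A real system would replace `fitness` with measured objectives (tests passed, latency, vulnerability count) and `mutate` with model-proposed patches validated by the build pipeline.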


fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence

Williams, Francis, Huang, Jiahui, Swartz, Jonathan, Klár, Gergely, Thakkar, Vijay, Cong, Matthew, Ren, Xuanchi, Li, Ruilong, Fuji-Tsang, Clement, Fidler, Sanja, Sifakis, Eftychios, Museth, Ken

arXiv.org Artificial Intelligence

We present fVDB, a novel GPU-optimized framework for deep learning on large-scale 3D data. fVDB provides a complete set of differentiable primitives to build deep learning architectures for common tasks in 3D learning such as convolution, pooling, attention, ray-tracing, meshing, etc. fVDB simultaneously provides a much larger feature set (primitives and operators) than established frameworks with no loss in efficiency: our operators match or exceed the performance of other frameworks with narrower scope. Furthermore, fVDB can process datasets with a much larger footprint and spatial resolution than prior works, while providing a competitive memory footprint on small inputs. To achieve this combination of versatility and performance, fVDB relies on a single novel VDB index grid acceleration structure paired with several key innovations including GPU-accelerated sparse grid construction, convolution using Tensor Cores, fast ray-tracing kernels using a Hierarchical Digital Differential Analyzer algorithm (HDDA), and jagged tensors. Our framework is fully integrated with PyTorch, enabling interoperability with existing pipelines, and we demonstrate its effectiveness on a number of representative tasks such as large-scale point-cloud segmentation, high-resolution 3D generative modeling, unbounded-scale Neural Radiance Fields, and large-scale point cloud reconstruction.
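The "jagged tensors" the abstract lists, i.e. batches of variable-length point sets packed into one flat buffer, can be sketched in a few lines. This illustrates only the general idea; fVDB's actual JaggedTensor layout and API may differ.

```python
# Sketch of a jagged tensor: a batch of variable-length point lists stored as
# one flat buffer plus per-item offsets, so GPU kernels can process the whole
# batch contiguously. General idea only; not fVDB's actual data layout.

clouds = [[(float(i), 0.0, 0.0) for i in range(n)] for n in (5, 2, 8)]

flat = [p for cloud in clouds for p in cloud]   # one contiguous buffer of 15 points
offsets = [0]
for cloud in clouds:
    offsets.append(offsets[-1] + len(cloud))    # offsets = [0, 5, 7, 15]

def item(i):
    # Recover the i-th point cloud by slicing the flat buffer.
    return flat[offsets[i]:offsets[i + 1]]

print(len(item(1)))  # → 2
```

The payoff of this layout is that a single kernel launch can sweep `flat` once, using `offsets` to attribute each point to its batch item, instead of padding every cloud to the largest size.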


PlaMo: Plan and Move in Rich 3D Physical Environments

Hallak, Assaf, Dalal, Gal, Tessler, Chen, Guo, Kelly, Mannor, Shie, Chechik, Gal

arXiv.org Artificial Intelligence

Controlling humanoids in complex physically simulated worlds is a long-standing challenge with numerous applications in gaming, simulation, and visual content creation. In our setup, given a rich and complex 3D scene, the user provides a list of instructions composed of target locations and locomotion types. To solve this task we present PlaMo, a scene-aware path planner and a robust physics-based controller. The path planner produces a sequence of motion paths, considering the various limitations the scene imposes on the motion, such as location, height, and speed. Complementing the planner, our control policy generates rich and realistic physical motion adhering to the plan. We demonstrate how the combination of both modules enables traversing complex landscapes in diverse forms while responding to real-time changes in the environment. Video: https://youtu.be/wWlqSQlRZ9M


NVIDIA's latest AI model helps robots perform pen spinning tricks as well as humans

Engadget

The role of humans in robotics, even as teachers, is shrinking thanks to AI. NVIDIA Research has announced the creation of Eureka, an AI agent powered by GPT-4 that trains robots to perform tasks using reward algorithms. Notably, Eureka taught a robotic hand to do pen-spinning tricks as well as a human can (honestly, as you can see in the YouTube video below, better than many of us). Eureka has also taught quadrupeds, dexterous hands, cobot arms and other robots to open drawers, use scissors, catch balls and perform nearly 30 different tasks. According to NVIDIA Research, the AI agent's trial-and-error reward programs are 80 percent more effective than those written by human experts, and as a result the robots' performance improved by over 50 percent.


NVIDIA's eDiffi Diffusion Model Allows 'Painting With Words' and More

#artificialintelligence

Attempting to make precise compositions with latent diffusion generative image models such as Stable Diffusion can be like herding cats; the very same imaginative and interpretive powers that enable the system to create extraordinary detail, and to summon up extraordinary images from relatively simple text prompts, are difficult to turn off when you're looking for Photoshop-level control over an image generation. Now, a new approach from NVIDIA research, titled ensemble diffusion for images (eDiffi), uses a mixture of multiple embedding and interpretive methods (rather than the same method all the way through the pipeline) to allow a far greater level of control over the generated content. 'Painting with words' is one of the two novel capabilities in NVIDIA's eDiffi diffusion model. Each daubed color represents a word from the prompt (see them appear on the left during generation), and the area the color is applied to will consist only of that element. See the source (official) video for more examples and better resolution at https://www.youtube.com/watch?v=k6cOx9YjHJc . Effectively this is 'painting with masks', and it reverses the inpainting paradigm in Stable Diffusion, which is oriented toward fixing broken or unsatisfactory images, or extending images that could as well have been the desired size in the first place.


NVIDIA AI Research Helps Populate Virtual Worlds With 3D Objects

#artificialintelligence

The massive virtual worlds created by growing numbers of companies and creators could be more easily populated with a diverse array of 3D buildings, vehicles, characters and more -- thanks to a new AI model from NVIDIA Research. Trained using only 2D images, NVIDIA GET3D generates 3D shapes with high-fidelity textures and complex geometric details. These 3D objects are created in the same format used by popular graphics software applications, allowing users to immediately import their shapes into 3D renderers and game engines for further editing. The generated objects could be used in 3D representations of buildings, outdoor spaces or entire cities, designed for industries including gaming, robotics, architecture and social media. GET3D can generate a virtually unlimited number of 3D shapes based on the data it's trained on.


NVIDIA Research Lets Developers Improvise With 3D Objects

#artificialintelligence

Jazz is all about improvisation -- and NVIDIA is paying tribute to the genre with AI research that could one day enable graphics creators to improvise with 3D objects created in the time it takes to hold a jam session. The method, NVIDIA 3D MoMa, could empower architects, designers, concept artists and game developers to quickly import an object into a graphics engine to start working with it, modifying scale, changing the material or experimenting with different lighting effects. NVIDIA Research showcased this technology in a video celebrating jazz and its birthplace, New Orleans, where the paper behind 3D MoMa will be presented this week at the Conference on Computer Vision and Pattern Recognition. Inverse rendering, a technique to reconstruct a 3D model of an object or scene from a series of still photos, "has long been a holy grail unifying computer vision and computer graphics," said David Luebke, vice president of graphics research at NVIDIA. "By formulating every piece of the inverse rendering problem as a GPU-accelerated differentiable component, the NVIDIA 3D MoMa rendering pipeline uses the machinery of modern AI and the raw computational horsepower of NVIDIA GPUs to quickly produce 3D objects that creators can import, edit and extend without limitation in existing tools," he said.


NVIDIA's Instant NeRF: transforming 2D images into 3D scenes in record time - Actu IA

#artificialintelligence

Instant NeRF, a neural network-based technology capable of transforming a set of 2D photos into high-resolution 3D scenes in seconds, was introduced at an NVIDIA GTC session in March. According to the NVIDIA Research team, it is one of the first models of its kind to combine ultra-fast neural network training with fast rendering. In its press release, NVIDIA recalls the technological revolution Edwin Land brought about on February 21, 1947 by producing an instant photo with a Polaroid camera. NVIDIA Research pays tribute to him by recreating an iconic photo of Andy Warhol taking an instant photo, turning it into a 3D scene using Instant NeRF. The researchers took the reverse approach to Land's: instead of capturing a scene as a still image, their goal is to transform a set of still images into a 3D digital scene in seconds.


NVIDIA Research: Tensors Are the Future of Deep Learning

#artificialintelligence

This post discusses tensor methods, how they are used at NVIDIA, and how they are central to the next generation of AI algorithms. Tensors, which generalize matrices to more than two dimensions, are everywhere in modern machine learning. From deep neural network features to videos or fMRI data, the structure in these higher-order tensors is often crucial. Deep neural networks typically map between higher-order tensors. In fact, it is the ability of deep convolutional neural networks to preserve and leverage local structure that made the current levels of performance possible, along with large datasets and efficient hardware. Tensor methods enable you to preserve and leverage that structure further, for individual layers or whole networks.
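The point about higher-order structure can be made concrete with mode unfolding, the basic matricization step behind tensor decompositions such as CP and Tucker. This is a generic illustration (pure Python, Kolda-Bader column ordering), not code from the post.

```python
# A 3rd-order tensor of shape (2, 3, 4) as nested lists, and its mode-0
# unfolding: the I x (J*K) matrix whose row i flattens slice T[i], with the
# second index varying fastest (column index j + k*J, the standard
# Kolda-Bader convention). Tensor decompositions operate on such unfoldings.

I, J, K = 2, 3, 4
T = [[[i * 100 + j * 10 + k for k in range(K)]
      for j in range(J)]
     for i in range(I)]

def unfold_mode0(t):
    J_, K_ = len(t[0]), len(t[0][0])
    return [[t[i][j][k] for k in range(K_) for j in range(J_)]
            for i in range(len(t))]

M = unfold_mode0(T)
print(len(M), len(M[0]))  # → 2 12
```

A decomposition like Tucker then factorizes each mode's unfolding; preserving the multi-way structure this way, instead of flattening everything to one long vector, is what lets tensor methods compress or regularize network layers without destroying local structure.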

